UNIVERSITY OF BRISTOL
INSTITUTE OF EDUCATION
Publication Number 4

MENTAL
MEASUREMENT


UNIVERSITY OF LONDON PRESS LTD.
WARWICK SQUARE, LONDON, E.C.4
Two Shillings Net


UNIVERSITY OF BRISTOL 
INSTITUTE OF EDUCATION 


Publication Number 4 


MENTAL 
MEASUREMENT


by
RUTH BOWYER
M.A., B.Sc.(Econ.), Ed.B.


Department of Psychology 
University of Bristol


UNIVERSITY OF LONDON PRESS LTD.
WARWICK SQUARE, LONDON, E.C.4


Copyright 1953 
by the University of Bristol 
Institute of Education 


Printed & Bound in England for the UNIVERSITY OF LONDON PRESS LTD.,
by HAZELL, WATSON & Viney, LTD., Aylesbury and London 


PREFACE 


The series of booklets published by the University of Bristol 
Institute of Education, of which this is one, is designed to bring to 
the teacher in the classroom the results of recent researches which 
already influence or are likely in the future to influence his work.
The study of mental measurement is by no means modern, but until
quite recently its importance could not be seen in clear perspective. 
At first, methods of measurement were empirical; then over- 
enthusiasm resulted both in the production of unscientific tests and 
also in their wide use in situations where scientific rigidity was out 
of place. Now much experience has been gained, and it is possible
to put its results in the hands of teachers in a way that will enable
them both to use the ample material available in this field and to
avoid the dangers of its misuse.

The most important aspects of education, because they concern 
the relationship of person with person, fall outside the scope of 
mechanical measurement. Traits of personality or partial aspects 
of personality can be measured, but human personality itself belongs 
to a non-material sphere where measurement is impossible and analysis
is only destructive. Mental measurement, restricted to its proper 
sphere, can be a very great aid to the teacher; misused, it can assist 
all of those forces at work in the modern world which tend to reduce 
man to a piece of social mechanism. This is the reason why the 
subject of mental measurement should be well understood, so that 
it should never be used to strengthen that, often unconscious, 
materialism and determinism which stultify educational practice, 
but that it should be used to release the creative spirit of childhood. 


CONTENTS


     PREFACE
I    INTRODUCTION
II   INTELLIGENCE TESTS
III  EDUCATIONAL TESTS
IV   APTITUDE TESTS
V    PERSONALITY TESTS
     CONCLUDING REMARKS
     ADDENDUM
     BIBLIOGRAPHY


I. INTRODUCTION


The aim of this booklet is to survey for teachers the salient
features of mental measurement which have established themselves 
in practice. There is no suggestion that such a brief survey should 
replace the technical studies, some of which are listed in the biblio- 
graphy. The booklet is produced as a memorandum of researches 
of the past half-century which have influenced and are influencing 
the work of the schoolroom. 


II. INTELLIGENCE TESTS 
1. Types of Test 


Intelligence tests can be classified broadly into two types, individual 
and group. Individual tests require a face-to-face relationship 
between the examiner and a single child at a time; group tests are 
designed to be given to a number of children together. Each of 
these types may be classified further into verbal, non-verbal, and 
mixed (i.e. tests containing both verbal and non-verbal items).
Non-verbal tests are sub-divided into paper-and-pencil tests (i.e. 
tests requiring only written responses—or answers by means of 
pencil marks, e.g. Porteus Mazes) and performance tests (i.e. tests 
requiring the manipulation of concrete material, as in wooden 
puzzles, e.g. Kohs' block-designs test). Tests are either "age-scales"
like the Terman-Merrill or "point scales" like the group
tests to be mentioned below. 


2. Uses of Intelligence Tests 
Intelligence tests are used in: 


(a) Classification for special schools. 

(b) Selection for different types of secondary school. 

(c) Grading within a school or within a class to secure homo- 
geneous groups for teaching purposes. 

(d) Vocational guidance. 

(e) Child guidance. 


3. Examples of Intelligence Tests 
(a) Individual Tests 


i. The Terman-Merrill or Stanford-Binet (1937) Intelligence Test.
The most widely used individual test is the Terman-Merrill revision
of the Binet scale. The last revision of this test was completed in
1937 under the supervision of Professor Terman and Dr. Maud 
Merrill, of Stanford University, California. It is a successor to the 
pioneer intelligence tests devised by Alfred Binet. Special training 
is required for any teacher, psychologist, or doctor using the Terman-Merrill,
but all teachers need to know something of its nature and
origin in order to understand: 


(a) when to refer a child for an individual intelligence test; 
(b) the significance of the results; 


(c) concepts such as mental age, norms, standardisation, I.Q. 


ii. The Binet-Simon Test. In 1904 the French Minister of Public
Instruction appointed a commission to make recommendations for
the special education of feeble-minded children in the Paris schools.
As a member of that commission, Alfred Binet, with the collaboration
of Théophile Simon, devised a test, which consisted of thirty
questions or tasks arranged in order of difficulty. The order of
difficulty was determined by asking fifty normal children of different
ages these questions. 

In Binet’s second scale (1908) the items were arranged in age- 
groups. Whenever an item was passed by 60-90 per cent. of a 
given age-group it was considered suitable for that age. For 
example, if most 5-year-old children failed at a task, it was moved 
up to the 6-year-old group of questions, where it remained if 60-90 
per cent. of 6-year-old children answered it. If more than 90 per 
cent. of 5-year-old children answered a question, it was moved down 
to be attempted by the 4-year-olds, and so on. 
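
The placement rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `place_item`, the dictionary of pass rates, and the figures in the example are hypothetical, and choosing the youngest qualifying age is a simplification of the moving up and down described above.

```python
def place_item(pass_rate_by_age, low=0.60, high=0.90):
    """Return the youngest age whose pass rate lies between low and high.

    pass_rate_by_age maps an age in years to the fraction of children of
    that age who passed the item. Returns None if no age qualifies.
    """
    for age in sorted(pass_rate_by_age):
        if low <= pass_rate_by_age[age] <= high:
            return age
    return None


# Hypothetical pass rates for a single item (illustrative figures only).
rates = {4: 0.20, 5: 0.45, 6: 0.75, 7: 0.95}
print(place_item(rates))  # 6
```

With these figures the item is too hard for the 5-year-olds and too easy for the 7-year-olds, so it settles in the 6-year-old group.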

Norms. That part of test construction which consists in finding 
for each item the age at which it is normal to answer it successfully 
is included under the term standardisation. These average scores 
or normal responses at different levels are the criteria against which 
an individual will be measured, and are known as the Norms.

Since Binet’s day, more rigid statistical requirements are observed 
both as regards the size and the representative nature of the sample 
of population from which the norms are obtained. An account of 
the 10 years’ work of standardising the 1937 Terman-Merrill is given 
in the introduction to “Measuring Intelligence” (by Terman and 
Merrill). In the present version of the test, the position on the
scale of each item has been determined as follows. If half of the
children aged 10 pass an item, it is placed at the 10-year-old level.
Similarly, if half of the children aged 11 pass an item, it is placed at
the 11-year-old level, and so on.

Mental Age. Thus, if a child passes all the items up to and
including those placed at age 11, his stage of mental development is
equivalent to that of an average 11-year-old child. Such a child is
said to have a mental age of 11 years. Similarly, if he passes all the
tests up to and including those placed at age 12, his mental age will
be 12 years. Thus, passing the group of tests placed at age 12 (six
in number) has raised his mental age by one year, and the passing of
each test is equivalent to the addition of two months of mental age.

From the number of tests passed, therefore, it is possible to cal- 
culate a child's equivalent mental age. 
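
The arithmetic can be sketched as follows. The sketch assumes the simple case described above, in which every item up to a given year level is passed and each further item counts two months; the names `mental_age_months`, `format_age`, and the term "basal age" for that highest fully passed level are introduced here for illustration only.

```python
def mental_age_months(basal_age_years, items_passed_above_basal, months_per_item=2):
    """Mental age in months: the basal age plus credit for each further item.

    basal_age_years is the highest year level at which all items are passed
    (a label assumed here); each item passed above it adds months_per_item.
    """
    return basal_age_years * 12 + items_passed_above_basal * months_per_item


def format_age(months):
    """Express a number of months as years and months."""
    years, remainder = divmod(months, 12)
    return f"{years} years {remainder} months"


# All items passed up to age 11, plus three of the six items placed at age 12.
print(format_age(mental_age_months(11, 3)))  # 11 years 6 months
```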


Binet's final scale, published in 1911, had a range of questions from
3 years up to age 15. The 1937 Terman-Merrill begins at mental
age 2 and goes up to superior adult levels. There are six questions 
for each year-group from year 5 to average adult, making each 
question equivalent to a score of two months. Before year 5 there 
are six questions for each half-year, since there are noticeable 
changes from month to month in such young children. 

The questions in the Terman-Merrill for age 2, 2½, 3 years, etc.,
are intended for older children who cannot answer all the questions 
for their own age, which may be 5 or 6 years. There are other 
individual tests specially designed for the pre-school child, e.g. 
Gesell's Developmental Norms, and the Merrill-Palmer test. Dr. 
Ruth Griffiths (London) is at present standardising a test which can 
be administered to infants as young as 3 weeks. Her intention is 
that the test can be used up to 5 years, when it should give place to 
the Terman-Merrill. Professor Gilliland (Evanston, Illinois) has
recently brought out a test for infants of 4-12 weeks. 

The following is a sample of tasks in the Terman-Merrill: 


Year 5

1. Picture completion: man (2 points required).
2. Folding triangle (paper, tester demonstrates first).
3. Definitions (ball, hat, stove).
4. Drawing a square (copying from a diagram).
5. Sentence memory (repeating sentence which tester has finished).
6. Counting four objects.


Although the test contains tasks of the nature of a performance 
test, such as paper-folding, drawing designs from memory, and bead- 
stringing, nevertheless, it must be regarded mainly as a verbal test, 
because it is carried out by means of personal conversation between 
tester and child. It is, in fact, a standardised interview, and much of 
the skill of the trained and experienced tester lies in making the test 
seem like a spontaneous and happy conversation, while keeping to 
the exact words and standard method of administration. The test
puts a premium on both the understanding and the use of spoken 
language. As this is the chief medium of instruction in our ordinary 
schools, the test is valuable in predicting academic success.


Intelligence Quotient 

Stern introduced the method of expressing the score made in an 
intelligence test as a ratio between mental age and chronological age.
A child with mental age 8 years and a chronological age 8 years
has a ratio of ability to age of 8/8 or unity. His stage of mental
development is the same as that of the average child of his age. A
child with mental age 8 years and chronological age 10 years has a
ratio of ability 8/10. To avoid fractions, the ratio is multiplied by
100. It is called an Intelligence Quotient (I.Q.). In the examples
just given, the first child has an I.Q. of 100 (average) and the second 
child has an I.Q. of 80 (dull). 
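
The ratio can be written out as a short Python sketch; the function name `intelligence_quotient` is chosen here for illustration and is not taken from any test manual.

```python
def intelligence_quotient(mental_age, chronological_age):
    """Ratio I.Q.: 100 times mental age divided by chronological age.

    Both ages must be given in the same unit (years or months).
    """
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return 100 * mental_age / chronological_age


print(intelligence_quotient(8, 8))   # 100.0
print(intelligence_quotient(8, 10))  # 80.0
```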




The benefit of expressing the test scores as an intelligence quotient 
is that within limits and for most children the intelligence quotient 
remains constant, and a knowledge of its value is thus helpful for 
prognostic purposes.

The amount of retardation, on the other hand, tends to increase
with age. Thus, for example, a boy aged 6 who is two years
retarded has a mental age of 4 and an I.Q. of 66, and at the age of 12
his most probable I.Q. will be 66 and his mental age 8. In other
words, he is now four years retarded.

A child who is two years behind at age 12, on the other hand, has
a mental age of 10 and an I.Q. of 83. At age 6 his mental age was
most likely 5 and his retardation one year. 
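
A brief sketch of this reasoning, on the assumption (made above) that the ratio of mental age to chronological age stays constant; the function names are hypothetical, and exact fractions are used so that no rounding enters the illustration.

```python
from fractions import Fraction


def projected_mental_age(mental_age_now, age_now, other_age):
    """Mental age expected at other_age if the ratio M.A./C.A. is unchanged."""
    ratio = Fraction(mental_age_now, age_now)
    return ratio * other_age


def retardation(mental_age, age):
    """Years by which mental age falls short of chronological age."""
    return age - mental_age


# Boy aged 6 with mental age 4, looked at again at age 12.
later = projected_mental_age(4, 6, 12)
print(later, retardation(later, 12))      # 8 4

# Child aged 12 with mental age 10, looked at back at age 6.
earlier = projected_mental_age(10, 12, 6)
print(earlier, retardation(earlier, 6))   # 5 1
```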

Vernon quotes the following example to drive home the point that 
mental age gives the present amount of intelligence and I.Q. gives
the rate of development: 


Child A. C.A. 9:0; M.A. 11:0; I.Q. 122.
Child B. C.A. 8:0; M.A. 10:6; I.Q. 131.


The first child (A), with a mental age of 11, is actually at a higher 
level of intelligence at present and should be expected to do more 
advanced school work. The younger child (B), with a mental age
of 10½, has a quicker rate of development and may be expected to
overtake A in a few years, other things (e.g. character) being equal. 
It depends on one's purposes whether one is more interested in mental 
age or intelligence quotient; usually both have to be considered. 


The distribution of intelligence in the general population has been 
estimated as follows: 


I.Q.            Classification               Per cent. of population
140 and above   Very superior                 0.4
120-139         Superior or very bright       8.8
110-119         Bright                       15.9
90-109          Average or normal            49.8
80-89           Dull normal (or backward)    15.9
70-79           Border-line, very dull        8.8
Below 70        Feeble-minded                 0.4


Alternative ways of expressing intellectual level as measured by
tests are percentile scores and standard scores. To say that a
person has a percentile rank of 95 is to indicate that he is among the
brightest five per cent. of the population, provided that the test has
been standardised on a representative sample of the population. A
standard score (also called "sigma" score) expresses the extent by
which an individual deviates from the mean, in terms of the standard
deviation, assuming that all the scores, when plotted graphically,
form a normal probability curve. An account of these measures is
given in Guilford's Psychometric Methods and in Freeman's Mental
Tests. In the introduction to Measuring Intelligence, Terman and
Merrill state that they have retained the custom of expressing
results in I.Q.s because many teachers and others who use test
results have built up from experience a knowledge of what the 
different I.Q. levels mean. Tests designed for adults, however,
express results more often in terms of percentile rank. This is
necessary because to secure an I.Q. a fictitious chronological age has to
be adopted as the divisor, based on the view that intelligence is mature
in most people by the time they reach adolescence. Study of scores at 
Moray House, Edinburgh, has tended to show that what is measured 
by intelligence tests goes on growing until at least 20-21 years. 
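
The two alternative measures mentioned above, the standard score and the percentile rank, can be sketched in Python. The sketch assumes, as the text does, that scores follow the normal probability curve; the mean of 100 and standard deviation of 15 in the example are illustrative assumptions and are not figures given for any test named here.

```python
from statistics import NormalDist


def standard_score(score, mean, sd):
    """Deviation of a score from the mean, in units of the standard deviation."""
    return (score - mean) / sd


def percentile_rank(score, mean, sd):
    """Percentage of a normally distributed population scoring at or below score."""
    return 100 * NormalDist(mean, sd).cdf(score)


# Illustrative figures only: mean 100 and standard deviation 15 are assumed.
print(standard_score(130, 100, 15))             # 2.0
print(round(percentile_rank(130, 100, 15), 1))  # 97.7
```

A score at the mean has a standard score of 0 and a percentile rank of 50 on this reckoning.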

When it is said that intelligence is mature by 15 or 20, this is some- 
times misinterpreted to mean that intelligence “stops growing” then, 
in the sense that no more can be learned, and that an average youth
of 20 is as intelligent as the average man of 30. Only if the youth 
and man are confronted with a problem which is entirely new to 
both, i.e. where differences of experience play no part, will their 
chances of solution be equal. Such problems may be comparatively 
rare in everyday life. 


Constancy of I.Q. 

While relatively constant for normal children, the I.Q. tends to 
increase with super-normal individuals and to decrease with sub- 
normals. The limits of development are reached earlier with dull 
children than with average and superior individuals. This is some- 
times represented in a diagram which shows also the relatively quick 
development in the early years, and the possibility that, after 
reaching maturity and remaining stable for several decades, intel- 
lectual ability begins to decline. Experiments on the effect of age on 
skill have been reported from Cambridge by Welford and others. 
An account of earlier work may be found in Bird's Social Psychology. 


[Diagram: curves of intellectual development plotted against years of age (5 to 45), labelled SUPERIOR and AVERAGE.]

Although in general the I.Q. remains relatively constant, i.e. it does 
not vary by more than a few points when measured at different times, 
nevertheless, emotional disturbances may upset the measurement. 
Young children (under 6 or 7 years) are difficult to test because 
shyness, fear, or distractibility can prevent them from showing their
best efforts. Before the age of 7 years there may be uneven devel- 
opment, with slowness in one or more directions, e.g. language, 
which may be quickly made up later. Temporary backwardness in
one or more directions seems to be bound up with emotional attitudes
and individual interests. 


Other Individual Tests of Intelligence 


A useful individual test for young children is the Draw-a-Man test 
by Florence Goodenough (Measurement of Intelligence by Drawings).
Temperamental children who will not concentrate on the Binet type 
of test (either the Terman-Merrill or the Merrill-Palmer) are often 
quite happy to draw a man, and the characteristics of such a drawing 
may be scored and used to give an approximate mental age. 

Adolescents are another age-group who do not test so reliably as
children in the middle years of childhood. The reasons
seem to be (a) heightened emotionality, and (b) the greater relative
weight which the development of special interests has on the indi- 
vidual’s total performance. 

Several tests have been used in recent years, instead of the Terman- 
Merrill, for adolescents and adults. The Wechsler-Bellevue has a 
verbal scale (e.g. information, comprehension, arithmetic) com- 
bined with performance items (e.g. object assembly, picture arrange- 
ment). It was designed partly to meet the criticism that the Terman- 
Merrill contained material which is unsuitable for those who have 
left school, and also because the authors believed in a different 
approach (see The Measurement of Adult Intelligence, by David 
Wechsler). The Shipley-Hartford is a test in two parts (vocabulary
and abstraction) which takes only twenty minutes to administer. 
The abstraction score may be compared with the score on vocabu- 
lary to ascertain if there has been deterioration (e.g. because of ill- 
ness). This is possible because it has been established that one’s 
vocabulary level remains fairly constant, once intellectual maturity 
has been reached, in spite of vicissitudes which may affect powers of 
abstract thinking. The Kent Oral Norms is a short test which is 


useful in an emergency where it is necessary to get an approximate 
indication of mental age. 


At any age it is [...] temperamental peculiarit[...]
gifted teachers who have stimulated individuals to show powers
which were unsuspected before, after which a higher I.Q. might be
obtained. [...] prevent an individual
from showing ability which he seemed to have on another occasion,
with the result that a lower I.Q. might be obtained. Accident or
illness may impair intelligence temporarily or permanently. With
the above provisos it has until recently seemed to be generally true 
that the I.Q. remains fairly constant in the sense that a naturally
bright person remains bright, an average person remains average, 
and a subnormal person subnormal, within insignificant limits of 
daily fluctuation. Most recent evidence, however, seems to suggest 
that the variation may be considerable, and this has reopened the 
question of constancy of I.Q., on which more research is required. 




In suggesting more accurate attention to Terman's cautious state-
ments on probable constancy of I.Q. (in his Measurement of
Intelligence, 1916, p. 68), Goodenough points out that, on the basis
of Terman's figures, in an average school of 500 children we should
expect to find changes in I.Q. as great as 20 points in about nine or
ten of the subjects on re-test. Nor is it always necessary to invoke
special causes for I.Q. change in individual cases (Goodenough,
Mental Testing, p. 165).


Performance Tests

The first individual intelligence scale not involving the use of
language was probably that of Pintner and Paterson, published in
1917. It is a series of wooden puzzles, including Form Boards
(boards of ply-wood with cut-out spaces of different design into
which shapes have to be fitted). Different batteries make common use
of several well-known single tests, e.g. the Kohs' Block Design Test. 
Thus the authors of a performance scale are often making a new 
arrangement and standardisation of existing tests, with perhaps one 
new sub-test of their own. 

In 1924, Collins and Drever produced a battery to test deaf 
children. It contains, among other items, Kohs' Blocks, Cube 
Construction, and several Form Boards, with a new test which 
consists of dominoes used to measure rote memory for numbers. 

Alexander's Battery (1936) consists of three sub-tests, viz. Cube 
Construction, Kohs' Blocks, and his own test, the Passalong. The 
latter is a series of puzzles where blocks in small wooden trays (or 
shallow boxes) have to be moved into position to form given 
designs without any block being lifted out of a tray. 


The Need for Performance Tests 
Performance tests are used: 


1. To test those who are verbally handicapped, by reason of deafness,
speech impediment, illiteracy, special weakness in language,
or unfamiliarity with the tester's language.

2. To test those whose general intelligence seems likely to be over- 
rated because of special language facility. 

3. To test those of difficult temperament, who often become 
interested in performance tests although they may refuse to 
respond to verbal tests. 

4. To measure special practical ability. 


Paper-and-pencil Non-Verbal Tests

Raven's Progressive Matrices (1938) is a non-verbal test based on
visual patterns which can be used either as an individual or a group test,
for ages 7 to 65. It is, however, rather difficult for children younger
than 11. There is a matrices test specially for children (1948).
Raven has also designed a matrices test for higher intelligence levels.




Another individual non-verbal test is a series of mazes designed 
by Porteus, of increasing difficulty, for children aged 6-14.


(b) Group Tests 

The occasion which led to the development of the paper-and-
pencil group test was the entry of the U.S.A. into the First World War
in 1917. During 1917-18 approximately 1,750,000 men were given 
tests devised by a committee of five psychologists who were specialists 
in the field of mental measurement (now frequently called psycho- 
metrics). These psychologists were Thorndike, Terman, Haggerty, 
Whipple, and Yerkes, who later adapted the American Army tests 
for English and American school purposes, under the name National 
Intelligence Tests (ages 8-13) (University of London Press).

In devising the American Army tests, all previous test material 
was considered, and also an unpublished test by Otis. Norms 
representative of the total population were obtained by giving the 
tests to primary and secondary schoolchildren, large groups of 
college students, men in officers’ training camps, over 5,000 enlisted 
men, and inmates of institutions for the feeble-minded. 

There were two forms of the American Army tests: the Alpha for 
the literate, and the Beta (a non-verbal test) for the illiterate and 
foreigners. The success of the tests was established by the suitable 
allocation of men in the services as predicted by the tests and by a 
post-war follow-up study which found, generally speaking, that 
those with highest test scores held the better civilian posts. The use 
of intelligence tests was stimulated, and from that time the features 
of good intelligence tests, some of which were explicitly formulated 


by the Committee before it started to work, have become increasingly
recognised. These are:

1. Validity.
2. Consistency (or reliability).
3. Adequate standardisation.
4. Material independent of specific information or schooling (as
   far as possible).
5. Scoring which is rapid, simple, and objective.
6. Grading in difficulty, such that those of low ability may
   attempt enough to give them a score, and those of high ability
   may be stretched sufficiently to show their limits.
7. The minimum of written [...]

[...]

10. Sufficiently interesting material to give incentive to those
    tested. Where the material has been found to be less interesting
    to one sex than to the other, separate norms are given
    for girls and for boys.




11. Different forms of the same test, equal in difficulty and stan- 
dardisation, in case a re-test is necessary. 


The following is a note on validity and consistency, the other 
features being self-explanatory. 

Validity. The prime consideration of any test is its validity, i.e.
the extent to which the test measures what it claims to measure. 
The purpose of the Binet tests, for example, is to estimate the ability 
of the children to succeed in the ordinary schools. Therefore the 
validity of the test is judged by following up the results of school 
work, as measured by examination marks and teachers’ estimates. 
The purpose of the army tests was to select those men most suitable 
for the different grades of work, and so the criterion of validity was 
agreement of the scores with officers’ estimates, promotions, and, 
after demobilisation, the agreement with success or failure at re-
sponsible work. The agreement of those external criteria (school
marks, employers' estimates) with the test scores can be expressed
statistically as a coefficient of correlation. Perfect agreement is
expressed as 1. The validity of the best intelligence tests is imperfect,
but has been found to be higher than the validity of predictions made 
from impressions unaided by tests. 

Consistency. Reliability or consistency is the degree to which a 
test gives a similar result on re-test, so that it is a stable measure. 
In spite of a certain amount of day-to-day or hourly fluctuation in the 
level of performance of everyone, tester and tested alike, most intelli-
gence tests in use have a reliability coefficient varying from .6 to .9.
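
Both the validity coefficient and the reliability coefficient are coefficients of correlation. As a minimal sketch, assuming the ordinary product-moment (Pearson) coefficient is the one intended, the calculation might run as follows; the function name `correlation` and the scores shown are hypothetical.

```python
from math import sqrt


def correlation(xs, ys):
    """Product-moment correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("scores in each list must not all be identical")
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical data: test scores against school marks (a validity check),
# or first testing against re-test (a reliability check).
test_scores = [95, 100, 110, 120, 130]
school_marks = [40, 55, 50, 70, 75]
print(round(correlation(test_scores, school_marks), 2))
```

Perfect agreement gives a coefficient of 1; the closer the result lies to 1, the closer the agreement between the two sets of scores.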

An account of a test’s standardisation, validity, and consistency 
is usually given in the manual of directions published with the test. 
Other features may be seen by looking at the specimen set which 
most publishers can provide. There is a useful booklet, What are 
the Moray House Tests?, published by the University of London 
Press in connection with the Moray House group tests for 11-years- 
plus. 

In reporting a test result the name of the test is given, because 
different tests have different dispersions of scores. An I.Q. of 150 
on a Cattell test, for example, is the equivalent of a somewhat lower 
score on a Moray House Test. A group test is a cruder instrument 
than an individual test, although the former serves reasonably well 
to place people in rank order. A group test is less trustworthy
because one has not “the feel” of the child, which in an individual 
test allows one to remove fear or misunderstanding, and to take 
nervousness into account when interpreting the score. Unexpected 
results from a group test, i.e. results which conflict with other
evidence, are usually checked against an individual test. 

Types of Question in Group Tests of Intelligence. Where the 
question or short problem is followed by a blank space, the answer 
is to be written in the space. Where alternative answers are given,
the correct one is to be underlined. 




1. Synonyms          (a) Blank
                         e.g. Charity means the same as . . .
                     (b) Choice
                         e.g. Diminish means the same as continue,
                         lessen, demand, release, find.
2. Opposites         (a) Blank
                         e.g. Bold is the opposite of . . .
                     (b) Choice
                         e.g. Beyond is the opposite of here, there,
                         within, nearby, including.
3. Classification    (a) Word
                         i. Underline the one which does not belong.
                            Cat, dog, rat, owl, horse.
                         ii. Choice. (Underline the word in brackets
                            which belongs with the first three words.)
                            Father, mother, aunt (friend, sister,
                            postman, teacher).
                     (b) Geometric figures
                         e.g. Cattell II and III.
4. Analogies         i. Man is to boy as woman is to . . .
                     ii. 9 is to 6 as p is to . . .
5. Sentence
   Completion        e.g. . . . anyone could move, the tree had fallen.
6. Number series     e.g. 3 5 7 9 . . .
7. Vocabulary        e.g. The place where a car is kept is called a . . .
8. Mixed
   Sentences         e.g. True bought cannot friendship be.
9. Following
   Instructions      e.g. If M comes before N write G unless P comes
                     after R, in which case write Z.
10. Rhymes           Underline the rhyming words,
                     e.g. Book, hill, cock, box, cook.


Individual Tests
(a) For young children:     Gesell's Developmental Norms (found in
                            his book, The First Five Years).
                            Intelligence Tests for Children, by
                            Valentine (Methuen).
                            Draw-a-Man Test, by Goodenough (see
                            Measurement of Intelligence by Drawings).
                            The Merrill-Palmer Test (Mental Measurement
                            of Pre-School Children, by Rachel Stutsman).
(b) For ages 8-11:          The Terman-Merrill (Test Material from
                            Messrs. Harrap and from Messrs.
                            Baird). Can be used also for younger
                            and older ages.
(c) For adolescents and     The Wechsler-Bellevue.
    adults:                 Kent Oral Emergency Norms.
                            Raven's Progressive Matrices (also suitable
                            for group testing).
                            Porteus Mazes (ages 6-14).
                            Shipley-Hartford.


Group Tests
(a) For young children:     Moray House Picture Test, by Margaret
                            Mellone (U.L.P.).
(b) For ages 8-11:          Junior Simplex, by Richardson, a verbal
                            group test (Harraps).
                            Sleight (non-verbal).
                            Otis (mixed). Published in older
                            age-ranges also.
                            Moray House Tests (for ages 9+ and 11+).
(c) For adolescents and     Cattell's Intelligence Tests.
    adults:                 AH5, Test for superior intelligence levels.
                            Stephenson's GVK.
                            Raven's Progressive Matrices, which can
                            be used in conjunction with the Mill Hill
                            Vocabulary Test.


III. EDUCATIONAL TESTS 


Educational tests may be called tests of attainment, achievement, 
or proficiency. They are designed to measure an individual's 
proficiency in a particular area. They are intended to tell what the 
person can do in that area or subject at the time when he is tested. 

Tests of attainment or achievement in any of the school subjects 
may take the form of: 


1. The traditional written examination, now called the “Essay- 
type" to distinguish it from 2 and 3. 

2. The objective-type examination.

3. Standardised tests. 


Detailed discussion of the traditional examination, its faults 
and advantages, with suggestions for improving its validity and 
reliability, may be found in Chapter 11 of Vernon's Measurement of 
Abilities. 

The objective-type examination is the kind of test where a large 
number of short questions are set instead of three or four longer 
ones. This form of test avoids at least two of the charges brought 
against the traditional examination, viz. inadequate sampling of the 
pupil's field of knowledge, and unreliability of marking due to
the subjective nature of the examiner's opinion as to the merit of
the answer.

The questions in the objective-type examination may take one
of several recognised forms:


A. Simple Recall and Open-Completion Type

   A question or short problem is followed by a blank space where
   the answer is to be written. Alternatively, certain words or
   short phrases are omitted from a sentence or paragraph,
   leaving blank spaces to be filled in. For example:

   1a. Who was the inventor of the railway engine?
   1b. The inventor of the railway engine was . . .


B. True-False Type

   This consists of a set of statements, approximately half of
   which are true, the rest false. The examinee has to indicate
   which is which by underlining True or False.

   1. She Stoops to Conquer was written by Sheridan. True. False.
   2. Scott, the explorer, reached the South Pole. True. False.


C. Multiple-choice Type, including Best Reason and Matching Items

   1. Put a cross opposite two of the following English painters
      who are specially famous for their landscapes:

      Constable  X       Reynolds  -
      Hogarth    -       Romney    -
      Lawrence   -       Turner    X

   2. Samuel Pepys is best known as the author of:

      Fiction  -         Plays    -
      Poetry   -         Sermons  -
      None of these  X

   3. Identify the following orchestral instruments by drawing a
      circle round S for strings, W for woodwind, B for brass,
      and P for percussion:

      Clarinet       S (W) B  P
      Triangle       S  W  B (P)
      Double Bass   (S) W  B  P
      English horn   S (W) B  P
      French horn    S  W (B) P


D. Best Reason, or Worst

   Many investigations indicate that children's intelligence is
   somewhat affected by environment and education. Which of
   the following provides the poorest evidence for this conclusion?

   The correlation between the intelligence of orphans
   and their parents is lower than that between children
   and their parents who have reared them. . . .

   The average difference between the I.Q.s of pairs of
   identical twins is higher among those reared apart
   than among those reared together. . . .

   Children tested before placement in foster-homes, and
   again after several years, show a greater increase in
   intelligence in the better than in the poorer homes. . . .

   Children of professional parents are on the average
   markedly superior in intelligence to children of
   unskilled labourers. . . . X


E. Rearrangement Type 


(a) Weaving      1. d
(b) Spinning     2. b
(c) Dyeing       3. c
(d) Carding      4. a


The objective-type examination is a useful supplement to the
traditional examination, leaving the latter free from the necessity to
test the range of factual knowledge, so that it can be used to measure
understanding of underlying principles, or originality, or power to
develop an argument. Validity of examinations is increased when
examiners decide beforehand what they want to measure, and the
relative weight to be attached to each aspect of the work, e.g. so much
for matter and so much for style.
The composing of good objective-type examinations perhaps involves
more work than in the case of the traditional examination. On the
other hand, the marking of the papers, where these are many, is less
arduous because there is no need to deliberate on the quality of the
answer, which is a matter of correct or incorrect factual information
to be marked right or wrong. Each correct response scores one point.
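
The one-point-per-correct-response rule can be illustrated by a short
sketch in Python. It is offered only as an illustration: the function
name, the item labels (B1, B2, C2) and the keyed answers are supplied
here for the purpose and are not part of any published marking scheme.

```python
def score_objective_test(answer_key, responses):
    """Give one point for each response that agrees with the key."""
    return sum(
        1
        for item, correct in answer_key.items()
        if responses.get(item) == correct
    )


# Hypothetical key for three of the sample items quoted above.
answer_key = {"B1": "False", "B2": "True", "C2": "None of these"}

# One pupil's (invented) paper: the second item is answered wrongly.
responses = {"B1": "False", "B2": "False", "C2": "None of these"}

print(score_objective_test(answer_key, responses))  # 2
```

Because the marker only compares each response with the key, no
judgment of quality enters into the score; an omitted item simply
earns nothing.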


Standardised Tests of Attainment 


Tests of attainment in school subjects are standardised in the
same way as intelligence tests. The items are given to a
representative sample of children, and the age at which it is normal
to pass each item is ascertained.

The Northumberland and the Moray House tests of English and
Arithmetic are paper-and-pencil group tests in booklet form which
are useful in conjunction with intelligence tests and school reports as
aids to selection for secondary schools. There are also Moray
House English and Arithmetic tests for children aged 9+.

In the junior school, useful tests are Burt's five-minute tests in
each of the four arithmetical processes, and a spelling test by Burt
(see Mental and Scholastic Tests).

There are several kinds of reading test, e.g. Word-recognition 
reading test by Burt (U.L.P.) and Word-recognition reading test by 
Vernon. These two reading tests may be bought on cards at 4d.
each (U.L.P.) and are also contained in a book, The Standardisation 
of a Graded Word Reading Test (price 1s.) by Vernon. For testing 
silent reading comprehension, tests by Schonell are available. 

For young children who cannot yet do Burt's written arithmetic 
tests, there is a one-minute oral adding and subtracting test by 
Ballard (printed on small sheets sold in packets of 100 by U.L.P.).


Educational Ages 


Attainment tests can be used to find a child's educational age in 
arithmetic or in reading, as the case may be. This means, for
example, that if he has answered correctly all the sums which were
placed at age 10, and half of the sums placed at 11, and no more,
then his stage of attainment in arithmetic is equivalent to that of a
child of 10½ years. He may be said to have an arithmetic age of 10½
years. The Moray House tests give tables for converting the raw 
score of a child whose age on the day of the test is known into a 
standardised score or quotient (English quotient, Arithmetic 
quotient). These quotients have an average value of 100, and a 
“standard deviation” of 15 points to agree in this with I.Q.s obtained
by a Binet individual examination (see What are Moray House 
Tests?, by Professor Godfrey Thomson). 
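
The two calculations described above may be sketched in Python. The
sketch rests on assumptions that should be stated plainly. The first
function credits the highest age level passed in full and adds the
fraction of items passed at each later level. The second is a simple
linear rescaling to a mean of 100 and a standard deviation of 15; it
is not the published Moray House conversion table, and the raw score,
group mean and group standard deviation used below are invented.

```python
def educational_age(base_age, fractions_passed_above):
    """Age level passed in full, plus the fraction passed at later levels."""
    return base_age + sum(fractions_passed_above)


def standardised_quotient(raw_score, group_mean, group_sd):
    """Rescale a raw score so that the age group has mean 100 and SD 15.

    Assumption: a linear rescaling stands in for the printed tables.
    """
    return 100 + 15 * (raw_score - group_mean) / group_sd


# The example in the text: all the age-10 sums, half of the age-11 sums.
print(educational_age(10, [0.5]))        # 10.5

# Invented figures: raw score 62 in an age group with mean 50 and SD 8.
print(standardised_quotient(62, 50, 8))  # 122.5
```

In practice the child's raw score would be looked up in the table for
his exact age on the day of the test; the formula only shows what a
quotient with mean 100 and standard deviation 15 signifies.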

It is useful to draw a graph or profile of a child’s scores in different 
subjects, so that his relatively strong and weak subjects may be seen 
at a glance.

Standardised attainment tests can be used also for measuring 
progress. This means that one might give an arithmetic test, for 
example, to a 9-year-old child on one occasion and find that he had 
an arithmetic age of 8 years 2 months, and give the test again five 
months later. On the second occasion he might be found to have 
an arithmetic age of 8 years 10 months, showing that his backward- 
ness in that subject was now less (an improvement of three months). 
Practice (i.e. the fact of having done the test on a previous occasion)
does not seem to falsify the results if the re-tests are at an
interval of at least three months.
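
The arithmetic of the progress example can be set out as a short
Python sketch. It assumes, for simplicity, that the child was exactly
9 years 0 months old at the first test; the function names are
invented for the illustration.

```python
def months(years, extra_months=0):
    """Express an age given in years and months as a number of months."""
    return 12 * years + extra_months


def backwardness(chronological_months, attainment_months):
    """Months by which the attainment age falls short of the real age."""
    return chronological_months - attainment_months


# First test: aged 9 years 0 months (assumed), arithmetic age 8 years 2 months.
first = backwardness(months(9), months(8, 2))

# Second test five months later: arithmetic age 8 years 10 months.
second = backwardness(months(9, 5), months(8, 10))

print(first, second, first - second)  # 10 7 3
```

The child gained eight months of arithmetic age in five months of
time, so his backwardness fell from ten months to seven, which is the
improvement of three months mentioned in the text.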


Diagnostic Tests 


When attainment tests bring out a discrepancy between a child's 
educational age in one or more subjects and the rest of his attainments,
or when his attainment ages are seen to be less than his mental
age, diagnostic tests may be used in an endeavour to find the sources
of weakness. Diagnostic tests are tests designed to give an oppor- 
tunity for the child to show exactly at what stage or in what aspect 
of a subject or branch of a subject he is encountering difficulty.
Complex items are broken down into their simpler components,
e.g. if a child made a poor score in long division, a diagnostic test 
might examine his ability to do simple adding and subtracting. 
Such tests are exploratory and qualitative rather than measuring 
instruments for giving a quantitative result. They help to reveal 
the kind of error made by the child, so that remedial work can be 
directed to the particular points requiring attention. 

Schonell's 100 number facts in addition, subtraction, and multipli- 
cation are specially useful for discovering which number combinations 
and which processes are giving most difficulty. Although these 
facts are combinations of only two figures, e.g.: 


7 5 9 3 
ር IUE O 


they are useful at almost any age to disclose any weakness in early 
grounding, specially as regards multiplication tables, e.g.: 


3 6 8 
x9 x4 x0 


Where lack of speed is the difficulty rather than error, the difficult 
combinations may be observed by watching the child working at a 
page of these number facts. For older children there are also the 
diagnostic tests in arithmetic and in English, published in booklet 
form, by Schonell. A pamphlet helpful in diagnosing the source 
of difficulty in reading has been published by Fife Education Com- 
mittee (Some Reading Deficiencies and Their Remedies, by G. Mac- 
gregor, 1933). 


IV. APTITUDE TESTS 


Tests of specific aptitude are of interest in vocational guidance and 
in choice of special direction for higher education. Aptitude tests are 
tests which endeavour to measure a person's natural talent or ability
for a subject before he has had any special training in that subject. 

The aptitudes for which standardised tests have been made are 
music, art, mechanical ability, manual dexterity, and clerical work. 
There are also tests made for specific jobs, after analysis has been 
made of the work. These specific tests may be either analytic or 
analogous. In the former case the work has been considered from 
the point of view of the separate abilities involved. Biscuit-packing, 
for example, is found to involve, among other things, finger dexterity, 
ability to estimate size, and to appreciate similarities or differences of 
pattern. Therefore a battery of tests to assess these separate skills
can be used to assess aptitude for biscuit-packing. An analogous
test, on the other hand, is one where the task is similar to the actual 
job. An example of such a test is the one devised by May Smith for 
laundry work, where paper shapes of garments with different laundry 
marks on them have to be sorted. 

Seashore's Tests of musical ability and appreciation, presented on
gramophone records, have been adapted and restandardised in
England by Dr. Wing, City of Sheffield Training College. One of
these tests is analytic (measuring sense of pitch, rhythm, acuity, etc.,
separately) and another (the Oregon test) is analogous, the test being
to compare pieces of music.

Alexander's battery (Passalong, Kohs’ Blocks, cube construction) 
gives a measure of practical ability or aptitude for the workshop 
subjects of technical schools. 

Examples of mechanical-aptitude tests are those of Bennet and 
Fry, and tests published by the Institute of Industrial Psychology, 
e.g. Cox's tests (using diagrams and models) and Vincent's mechanical
models. Cox's nailboard, nailstick, and eyelet board are
tests of manual dexterity. A more recent test of manual dexterity
is Cockett's Peg-board, used in the R.A.F. 


V. PERSONALITY TESTS 


The attempt to measure personality has been, so far, much less 
successful than the assessment of intelligence. This is not surprising,
since intelligence is only one aspect of personality, which is very
much more complex, although intelligence itself is complex enough.

There is controversy as to the components of personality. Definitions
will be found in Allport's book, Personality. Tests approach
the task of measurement from various aspects, assessing traits of
temperament, disposition, attitude, or interests. Sometimes the
term “orectic tests” is given to such tests, implying that they are
assessing emotion plus conation, the affective and goal-seeking
aspects of mind.


Two broad lines of approach may be distinguished:

A. The measurement of traits,
B. The study of types.


In contrast with the quantitative approach is that of seeking to 
understand the unique pattern of the individual as a whole. His
approximation to a recognised type may be described. This is a 
qualitative method, on the whole, although it may involve a certain 
amount of counting, as in scoring the responses of projection tests 
like the Rorschach to be described below. Projection tests are 
designed to stimulate the individual to reveal his inner, often un- 
conscious, motives. 

It has become traditional to regard the quantitative method 
(measurement of traits) as typically American, and the qualitative 
(descriptive) approach as German or Continental. British psychologists
have come to regard both methods as complementary. They
have contributed statistical techniques and a special interest in 
improving the interview as a method of assessing personality. 


Quantitative Methods 

1. Rating and Ranking 

Rating means the giving of a mark or score representing much or 
little or a moderate amount of the trait in question. Ranking, as the 
name implies, means placing a person in order of precedence in the 
population or in whatever group he is being considered as a member. 

Graphic Scales. One way of rating a pupil for any characteristic,
e.g. leadership, is to mark his position on a line with descriptive
phrases at different steps, e.g.:

Poor.      Fair.      Good.      Very good.      Excellent.


Sometimes a question introduces the rating, e.g. (from Laird's
Personal Inventory): 


In social conversation how have you been? 


Talkative.   An easy     Talked        Preferred     Refrained
             talker.     when          listening.    from
                         necessary.                  talking.


Errors to be guarded against when rating any individual are the Halo 
effect, Logical Error, and Error of Central Tendency. 


i. The Halo Effect. This tendency was noted by Wells in 1907 
as a constant error to which all judges are liable. It was given the 
name “halo effect” by Thorndike in 1920. The error takes two forms: 


(а) There is a tendency to move the rating of any quality in the 
direction of the general impression which one has formed previously 
of the individual. If one has formed a good impression of a person,
the tendency is to be prejudiced in one’s expectations of the new 
piece of behaviour or work or trait, and to view it more highly 
accordingly. Similarly, bias can be produced (in the opposite 
direction) if one has formed a bad impression.

(b) There is a tendency for one rating to affect subsequent ones, i.e.
to mark traits always above or below the mean. To counteract this
tendency the positions of high and low are sometimes reversed, e.g.:


Neatness.        /                /                /
            Very tidy        Moderately       Very untidy and
            and neat.        neat.            slovenly.

Punctuality.     /                /                /
            Always late.     Moderately       Always on
                             punctual.        time.
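
When some scales run from good to bad and others from bad to good, as
the Neatness and Punctuality lines do, the marks must be brought to a
common direction before they are compared or added. The following
Python sketch shows one way of doing so. It is an assumption of this
illustration, not a rule stated by any test author, that steps are
counted from the left, beginning at 1, and that a higher final score
is to mean a more favourable rating.

```python
def favourable_score(position, n_steps, favourable_on_left):
    """Convert a mark on a graphic scale to 'higher is more favourable'.

    position counts the steps from the left-hand end, starting at 1.
    """
    if not 1 <= position <= n_steps:
        raise ValueError("position must lie between 1 and n_steps")
    if favourable_on_left:
        return n_steps + 1 - position
    return position


# Neatness: the favourable end ("Very tidy and neat") is on the left.
print(favourable_score(1, 3, favourable_on_left=True))   # 3

# Punctuality: the left-hand end is "Always late", the unfavourable end.
print(favourable_score(1, 3, favourable_on_left=False))  # 1
```

A rater who marks every line at the left-hand end from habit thus
gives a high score on one trait and a low score on the other, and the
habit shows itself instead of passing unnoticed.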

ii. Logical Error. This error is the tendency to give similar 
ratings for traits which in the mind of the examiners are related. 
For example, if an individual is rated highly for politeness, he may 
tend to be rated highly also for unselfishness, or if he has bright eyes 
he may be rated bright in intelligence, without other evidence than
“logical presuppositions in the minds of the raters.” The validity
of the ratings should always be checked against objectively observed 
behaviour, i.e. the rater ought to be able to cite examples of observed 
conduct from which he formed his judgment of rating (or score).

iii. Error of Central Tendency. This is the tendency to avoid
extreme judgments, and so to displace individuals in the direction 
of the average of the group. One way of counteracting this ten- 
dency is to increase the number of steps, e.g. to use a seven-point 
scale where a five-point scale is wanted, thus ensuring that the five 
points of the scale are more frequently used. However, the best 
safeguard against these tendencies is to be aware of them.

The Man-to-man Scale. A technique which combines ranking 
and rating is the man-to-man scale. It was first used in the armed
forces, but is readily adaptable to school use. Senior officers were 
asked to write down the names of a dozen or so men well known to 
them, and to arrange the names in rank order for a given trait, e.g. 
leadership. The qualities constituting the trait were first defined 
(initiative, tact, ability to secure co-operation). Five landmarks 
were located in the list, viz. the top, the bottom, and the middle man, 
and the two men midway between the extremes and the middle. 
To these five men marks were allotted on a scale with regular steps,
thus: 


Highest    Lieut. J.    15
High       Capt. W.     12
Middle     Lieut. S.     9
Low        Lieut. M.     6
Lowest     Lieut. X.     3


By reference to this scale, any individual who came up for rating 
could be given his appropriate score, e.g. Captain B. 13, by com- 
paring his behaviour with that of the five men whose behaviour was 
well remembered and had been scored. 
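
The choosing of the five landmark men from a ranked list can be
sketched in Python. The sketch assumes that the names are already in
rank order, best first, and that the landmarks fall at the top, the
quarter, the middle, the three-quarter point and the bottom of the
list. The function name and the thirteen "officers" are invented; the
marks 15, 12, 9, 6, 3 are those of the table above.

```python
def landmark_scale(ranked_names, marks=(15, 12, 9, 6, 3)):
    """Pick five landmark names from a ranked list and attach the marks.

    ranked_names must run from the highest-ranked man to the lowest.
    """
    last = len(ranked_names) - 1
    if last < 4:
        raise ValueError("at least five ranked names are needed")
    # Top, midway-high, middle, midway-low and bottom positions,
    # rounded to the nearest place in the list.
    positions = [(last * k + 2) // 4 for k in range(5)]
    return [(ranked_names[p], mark) for p, mark in zip(positions, marks)]


# Thirteen invented names, already ranked for leadership.
officers = ["Officer " + letter for letter in "ABCDEFGHIJKLM"]

for name, mark in landmark_scale(officers):
    print(name, mark)
```

With thirteen names the landmarks are the first, fourth, seventh,
tenth and thirteenth men. The rating of a new individual against
these landmarks remains a matter of judgment and is not reduced to
calculation here.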


2. Standardised Tests of Personality

Paper-and-pencil tests of the questionnaire type seek information 
about attitudes and traits, such as sense of humour, judgment in 
social situations, radicalism and conservatism. Two well-known 
American tests are the Bernreuter Personality Inventory and the 
Pressey X—O (pronounced “cross-out”). The former, by means of
125 questions, measures individual differences in four traits: Intro- 
version-extraversion, Dominance-submission, Self-sufficiency, and 
Neurotic Tendency. The norms are given in terms of percentile 
rank. In doing the test a circle is drawn round one of the three
alternative responses: Yes, No, ?. The interrogation mark signifies
that the subject is not able to give a definite answer perhaps because
his answer would be “sometimes,” or because it depends on circumstances
more complex than can be indicated by “Yes” or “No,” e.g.:


5. Yes No ? Can you stand criticism without being hurt? 
121. Yes No ? Do you like to be with people a great deal? 
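
Since the norms are expressed as percentile ranks, a short Python
sketch may show what such a rank means. The definition used (the
percentage of the norm group scoring lower, counting half of those
with the same score) is one common convention and is assumed here; it
is not taken from the test manual, and the norm scores are invented.

```python
def percentile_rank(score, norm_scores):
    """Percentage of the norm group falling below a given score.

    Scores equal to the given score are counted as half below it.
    """
    below = sum(1 for s in norm_scores if s < score)
    equal = sum(1 for s in norm_scores if s == score)
    return 100.0 * (below + 0.5 * equal) / len(norm_scores)


# Invented trait scores for a norm group of ten persons.
norms = [-40, -25, -10, 0, 5, 15, 30, 45, 60, 80]

print(percentile_rank(30, norms))  # 65.0
```

A percentile rank of 65 says only that the subject's score exceeds
that of about 65 per cent. of the group with which he is compared; it
is a statement of position, not of amount.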


The Pressey X—O consists of three lists of 25 groups of items, e.g. in 
Test I: 

1. begging smoking flirting spitting giggling 

2. fear anger suspicion laziness contempt 

25. teasing insanity flunking vomiting borrowing 


The subject (i.e. the person undergoing the test) is asked to cross out 
everything that he thinks is wrong—everything that he thinks a 
person is to be blamed for. In Test II the subject crosses out every- 
thing about which he has ever worried, or felt nervous or anxious. 
In Test III he crosses out everything he likes or which interests him,
e.g.:

6. talking elocution acrobats minstrels smoking

17. French Drawing English History Science

20. business-men salesmen nurses teachers soldiers


This test has been restandardised for British children by Mary
Collins. Descriptions of other tests such as the Allport-Vernon
Study of Values and the George Washington Test of Social
Intelligence (judgment in social situations, observation of human
behaviour, recognition of the mental state of the speaker, sense of
humour) will be found in Chapter 25 of Munn’s Psychology. There 
is an illustration in the same chapter of a personality profile, or 
psychograph, which can be drawn after a number of tests of different 
personality traits have been given and scored on a comparable basis. 

Although standardised tests of the kind just described belong to 
the quantitative approach to personality assessment, it is clear that
they can be used qualitatively as a basis for discussion with the 
individual child. 


Qualitative Methods 

Projective Techniques. Projection tests are ways of stimulating
an individual to reveal his inner motivations, wishes, fears. The
technique releases unconscious attitudes. The method is one of 
expression, in contrast to rating, which is a method of impression. 
In projective methods the products of the individual's imagination 
and fantasy are interpreted. How much understanding is gained 
depends largely on the experience and insight of the tester, and the 
method is not one that can be used by the non-specialist. The 
specialist is aware of the need to check the validity of his findings. 

Blot and Picture Interpretation. Two well-known projection tests
are the Rorschach Test and Murray's Thematic Apperception Test
(known as T.A.T.). The Rorschach Test consists of ten symmetrical
ink blots, each on a separate card. Five of them are
coloured. The subject's responses are interpreted according to
(a) the location (whole or part of the blot first), (b) the determinants
(shape or form, colour, movement), and (c) content or meaning.
A vast literature has been published about the test and its results,
and the Rorschach Institute gives special training in the method. 
The kind of information which can be obtained is shown by the 
following descriptive result obtained by a graduate student of 
psychology, and checked as valid from external evidence:

“The subject's scores indicate a person of normal intelligence,
with an unusually active imagination, tending to compulsion, 
with artistic interest, absence of pedantry or meticulousness, a 
touch of stubbornness, although with affective adaptability and 
social and practical interests; a person who is active and 
energetic but has a depressive tendency or feelings of wish- 
fulfilment to the mastery of which a great deal of energy goes 
in the presence of others.” 


One drawback of the Rorschach test is that the scoring and inter- 
pretation are very time-consuming. There is a useful manual by 
Klopfer and Kelley, The Rorschach Technique (Harrap). The
training, however, is long and specialised.

Murray's Thematic Apperception Test consists of a standard 
series of pictures so devised that they lend themselves to a variety 
of interpretations. Thus the subject is forced to interpret the theme
of each picture in accordance with his own predilections. One
picture shows a figure which might be either a man or a woman
standing beside a pillar which might be either a lamp standard or
a bus stop. Another picture is that of a girl holding on to a door
and bowing her head. One hand covers her face. Munn (Psychology,
p. 467) quotes four interpretations of this picture, given by students 
in answer to the stimulus, “Tell me what events have led up to the 
present occurrence, what the character in the picture is thinking and 
feeling, and what the outcome will be.” The interpretation is related 
to Murray’s theory of needs and drives. 

Many research workers are experimenting with projection tests. 
There is a Four Picture Projection Test by Van Lennep. Another
test is the Szondi, which consists of photographs (head and shoulders)
of people of various occupational and constitutional types. The 
subject is asked to choose the persons whom he likes best and least 
in each group of portraits, and his personality is interpreted accord- 
ing to the pattern of his acceptances and rejections. Dr. Lydia 
Jackson has devised a Family Relations Test in which the pictures 
presented are chosen to suggest specific family situations, such as an 
only child sitting alone while its parents talk together, or the presence 
of a new baby. Such a test is regarded as a “controlled” projection
test because the associations which the picture may evoke from the
child are aroused in connection with specific problems which the 
tester has in mind to explore. This is in contrast to projective 
techniques which use less structured material, such as Plasticine, 
moist sand, finger-paints, or free imaginative play where there is no 
direction given to the child as to topic or point of departure for his 
imaginative expression. Dr. Pickford has experimented with a 
Muttering Test using inchoate verbal material, obtained by playing 
backwards on a sound-machine a passage from a book or a conversation
which had been previously recorded. The listener is
asked to interpret the “speech.”

Story and Sentence Completion. A form of projection test which
can be used conveniently as an ordinary school exercise is the com- 
pletion of stories or sentences. One of the early examples of this 
type of test was devised by Haggard and Wolff in 1942. Haggard 
asked children the name of their favourite comic papers and their 
hero in the comic strips. Having discussed with each child the 
adventures of the chosen character, Haggard said, “Now let us make 
up new adventures for him." The use of the third person allows 
attitudes to be released which might be withheld owing to embarrassment,
fear, or guilt if sought by direct questioning.

Raven's Controlled Projection Test is intended to tap interests, 
fears, and wishes. The tester secures the child's co-operation in
telling a story. For example, the tester begins:

“Once there was a boy (or girl)

1. (a) What did he like doing?
   (b) Whom did he like playing with?
   (c) Whom did he not like being with?

One day this boy went out with his mother and father and they got
cross with him.

2. What was it about?”

Raven's test for adults on similar lines investigates among other
things attitudes to work and to the opinions of others. The test can 
be used with pupils of secondary-school age as well as for adults. 
The subject is encouraged to draw anything he wishes while pro- 
ducing his story orally; the intention is to free him from self- 
consciousness and restraint, but the drawing may be included in the 
interpretation.

The Duss Story Completion Test (used in Lausanne by Mlle Duss,
and translated by Dr. Frank Bodman) consists of ten little stories 
to be completed, each intended to explore specific problems such as 
aggressiveness, guilt, anxiety, possessive character. For example, 
she tells the following story as a test of the attachment of the child 
to one or other of the parents or of his independence: 
“A father bird, a mother bird, and their little bird slept in a 
nest, on the branch of a tree. One night there was a great
storm and the wind blew the nest out of the tree and it fell to the
ground. The three birds woke up with a start. The father
bird quickly flew into a pine tree, the mother bird flew into 
another tree—What do you think the little bird did? He had 
already learnt to fly." 


Mlle M. Rambert at Lausanne uses puppets with which children
can act plays. This form of projection is suitable for children of 
infant-school age. Mlle Rambert's collection of puppets includes 
figures of the child's home (father, mother, nurse, aunt, grandparents),
teachers, other children, and also puppets to represent abstract 
ideas such as death, the devil, witches, and other possible inhabitants 
of a child's fantasy. 

Sentences to be completed are convenient to administer as part of 
ordinary school work, e.g.: 


John prefers the company of... 

He didn’t like Bill because he was too... 
He made a point of...

It is embarrassing... 


These examples are quoted from Symonds (Journ. Abn. & Soc. 
Psych., July 1947), who includes sentences in the first person, e.g.: 


My ambition is... 
My greatest worry... 
I enjoy...


It is not difficult to make up sentences to suit the particular children 
one is called upon to understand. However, caution is needed in 
regard to how far a child is or is not expressing personal attitudes. 
Professor Burt found that useful themes for school compositions 
intended for use as personality indicators were: 


(a) What I should like to be doing in 15 years' time. 
(b) The history of my life. 


Rather outside the field of mental measurement as such, but useful 
as diagnostic instruments, are "Five-minute essays" as used by 
A. S. Neill, or two-minute essays on topics given impromptu. Class 
magazines put together once a month from spontaneous contribu- 
tions prepared in some of the ordinary composition times throw 
much light on character. There needs to be complete ease and 
freedom to contribute anything, e.g. an advertisement requesting 
exchange of stamps or papers, a serial story, an original poem or 
favourite quotations, jokes new or old, film reviews or local news. 
There cannot be demands for correct spelling, comparisons, or 
requirements regarding merit. A testing technique is invalidated if
used as a teaching technique. On the other hand, it has been found
that the children's interest is so strong that they themselves are 
motivated to improve their way of expressing themselves. Classes
for dull children have, in their efforts for the class magazine, produced
work which they and others “did not know was in them.” 

Dreams. After the First World War, Dr. Kimmins, school 
inspector on the staff of London County Council, collected some 
5,000 dreams from schoolchildren. The children were asked to 
write about any dream which they had had recently. The task was
set without preparation during one of the English composition 
lessons. Interesting light was thrown on the attitudes and experience 
of the dreamers, who included blind, deaf, delinquent, and normal 
children of different ages and from different kinds of school. The 
dreams of the younger children, which were short, were written 
down for them by their teacher, to whom they told them individually.
An account of this work is given in the Handbook of Child Psychology,
edited by Murchison. A point of interest noted by Dr. Kimmins is
that the children enjoyed the task and surpassed their usual standard 
of work. It has been found that projective techniques have a 
beneficial effect on the person besides stimulating him to give 
information about himself (often without knowing how much he is 
revealing). 

It is clear that projection techniques have been in use by many 
who did not know this name for them. In schools they can be most 
easily given as part of the usual work in the primary department, 
and by teachers of English literature and language, and in the Art 
classes. It does not seem so easy to fit them into the curriculum of 
the mathematics lesson, for instance. What is new is (a) the official 
recognition, by giving the method a name, that an individual reveals 
his wishes and attitudes in his fantasies, and (b) the attempt to
systematise, to validate, and make reliable the method and results
obtained.

A recent research by Stephen Wiseman concerns a method of
securing valid and reliable marking of English composition in
grammar-school selection.

Social Aspects of Personality. Brief reference to the study of 
personality from the social point of view shows the following useful
methods: 


1. Time Budgets. Subjects are asked to keep a note for a certain 
length of time, e.g. a week or a month, of the distribution of their
time. One of the earliest accounts of this kind was The Use of Time
in Farm Houses (University of Nebraska, 1928).

2. “Psychological Geography.” E.g. Moreno's study of the life-space
of city children, the areas covered—a school, playground,
street, clubs, holiday excursions—and the pattern of their friendships.
Lewin's Topological Psychology studies personal attitudes
towards group memberships (leader, followers, comrade) and
towards the obstacles (resistances) of the community or environment.

3. Social Maturity Scales are standardised in the same way as
intelligence tests. The Vineland Maturity Scale, for example (by Doll),
contains the following examples:
following examples: 


Age I-II 
Asks to go to toilet. 
Initiates own play activities.
Removes coat or dress. 
Eats with fork. 
Gets drink unassisted. 
Dries own hands. 
Relates experiences. 


Age IX-X
Makes minor purchases. 
Goes about home town freely. 


Age XV-XVIII 
Communicates by letter. 
Follows current events. 
Has own spending money. 
Buys all own clothing. 
Goes to nearby places alone. 


Age XXV 
Performs skilled work. 
Advances general welfare. 
Inspires confidence. 
Creates own opportunities.


The use of such a scale is much more limited because of cultural and
environmental differences than is a scale of general development or 
maturation such as Gesell's. Also this scale has been found more 
useful in distinguishing differences among mental defectives than 
among normal children. 

Interview. The interview seeks to combine qualitative and 
quantitative methods of assessing personality. The War Office and
Civil Service selection boards, for example, use test scores, rating, 
methods of expression, and behaviour tests (miniature life situations) 
such as problems set to leaderless groups. Burt has developed 
different forms of interviews for children (Brit. Journ. of Educ. 
Psych., 1947).

There is the form of interview where the child does not know that 
he is already being interviewed when he is asked to wait in a room 
where a man apparently casually asks him to help in arranging books 
in a cupboard and so falls into conversation. The conversation, 
although appearing to be casual, is skilfully directed to points on 
which information about the child's personality is desired. Another
type of “interview” takes the form of a party or a visit to the Zoo
during which certain prearranged "situations" occur. 

A short book by Oldfield, The Psychology of the Interview 
(Methuen), makes recommendations on ways of putting candidates 
at their ease so that they behave in a natural manner. Interviewing
for vocational guidance is discussed by Macrae in Talents and 
Temperaments. 


CONCLUDING REMARKS 


Ideally, there is no reason why a child whose progress at school is 
good and who is happy at home and school should be given any 
tests, except for research. Such research is important because it 
throws light on the problems of childhood and teaching, and it must 
be done on normal children to find out the conditions of normality, 
just as much as upon backward and neurotic children to find out 
why they are abnormal and what to do to help them. People in
general should be discouraged, however, from thinking that every
child ought to be tested, just to find out how good he is, or to show that
he is better than another child. In practice, however, there is the 
vexed question of competition for places in secondary school, which 
has brought into prominence in recent years the use of mental tests 
in helping to select pupils. It is a moot question whether, while 
there is this necessity to examine at 11+, there should be a pre- 
liminary testing for school records about the age of 7.

It should be remembered that, in studying individual differences, 
individual tests of all kinds are the ideal, and group tests are second 
best and should be used only when it is impossible to give the time 
and effort needed for an individual test. Group tests give the 
approximate rank order of children in the group, but have less power 
than an individual test to distinguish, e.g. between children who are 
not markedly different in ability (say, all those within range of 
95-105 I.Q.). Thus they should be used only to help in ranking, and 
not to label a child with a definite figure of I.Q.
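
The use of group test scores for ranking only can be illustrated by a short sketch in Python. The names and raw scores below are invented for the purpose, and the function rank_order is hypothetical and not part of any published test; it merely places the children in order and lets tied scores share a rank, without attaching any figure of I.Q. to a child.

```python
def rank_order(scores):
    """Return {name: rank}, where rank 1 is the highest score.

    Children with equal scores share the mean of the positions they occupy.
    """
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        # Find the end of the run of children tied with position i.
        j = i
        while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
            j += 1
        shared = ((i + 1) + (j + 1)) / 2  # mean of the tied positions
        for name, _ in ordered[i:j + 1]:
            ranks[name] = shared
        i = j + 1
    return ranks


# Hypothetical raw scores on a group test (invented for illustration).
raw_scores = {"Ann": 52, "Bob": 47, "Cyril": 52, "Dora": 39}
ranks = rank_order(raw_scores)
for name, rank in sorted(ranks.items(), key=lambda item: item[1]):
    print(name, rank)
```

The output is an order of merit within this particular group and nothing more; children whose scores lie close together may well change places on another occasion.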

Mental measurement does not seek to reduce the child to mechanised
and standardised quantities. The individual is always unique,
even though we can make generalisations and quantitative abstrac- 
tions from the study of populations and groups. The scientific 
generalisations are to help us to understand the individual and his 
many and varied problems. Sound statistical treatment can clarify
the significance of our observations, but is worse than useless without
sound observation. Measurement of those aspects which we have
learned to measure is valuable only if we remember that there are 
many aspects which we have not learned to measure. Many of 
them we can only guess at or describe dimly. Continued study of 
children of all kinds and the relation of our measurements to our
qualitative experiences will be required before we can have much
insight into children in general or an individual child.

Lastly, in writing for teachers, parents, and others who are likely
to gain from the development of scientific tests, it must be said that
on no account should children be coached in tests in the hope that 
this will enable them to do better. Such a course will only defeat 
the aim of these tests and spoil their practical value by giving false 
results. A different point of view has been suggested to meet the
problem of annual local competitive examinations where it is sus- 
pected that some of the entrants have received coaching in intelli- 
gence tests. The suggestion is that all entrants should receive 
preliminary coaching for about three hours. Experiments have 
shown that diminishing returns set in after that amount of coaching.
If this device is adopted, it is all the more necessary to regard the 
resulting scores only as an aid to ranking the particular group, and 
not for estimating the I.Q., since the norms published with the test 
will not be applicable to this situation. 

Those who wish to understand the theoretical side of mental 
measurement, which is outside the scope of this booklet, will find 
much to interest them in Goodenough's Mental Testing, Vernon's 
Measurement of Abilities, Vernon and Parry's Personnel Selection in 
the British Forces, Guilford’s Psychometric Methods, Spearman’s 
Abilities of Man, Thomson’s Factorial Analysis of Human Ability, 
Burt’s Factors of the Mind, and Thurstone’s Multiple Factor Analysis. 


ADDENDUM 


Since these notes on mental measurement were written, there has 
appeared the Wechsler Intelligence Scale for Children (known as the 
WISC). A British standardisation of this test is under way. Time
will show to what extent this test can be an alternative or a supple- 
ment to the Terman-Merrill scale. 

There is available now a Graded Arithmetic-Mathematics Test 
prepared by Dr. Vernon (University of London Press), which prob- 
ably will replace older tests against which the criticism has been 
levelled that their norms are out of step with present standards of 
attainment in arithmetic. 


BIBLIOGRAPHY 
(Published in London unless otherwise stated) 


ALEXANDER, W. P.: Intelligence, Concrete and Abstract. Cambridge 
University Press, 1935. 

ALLPORT, G. W.: Personality: A Psychological Interpretation.
Constable, 1949. 

BERNREUTER, R. G.: A Personality Inventory. Stanford University 
Press, 1939. 

BURT, C.: Factors of the Mind. University of London Press, 1940.

BURT, C.: Mental and Scholastic Tests. Staples Press, 1947.

BURT, C., and others: Symposium on Personality. British Journal
of Educational Psychology, Vol. 15, Part 3, 1945; Vol. 16, 1946;
Vol. 17, Part 1, 1947.

CATTELL, Raymond B.: A Guide to Mental Testing. University of 
London Press, 1948. 

CATTELL, Raymond B.: Description and Measurement of Personality.
Harrap, 1946.

COCKETT, R.: Development of a New Test of Manual Dexterity.
British Journal of Occupational Psychology, Vol. 21, No. 4, 1947.

EYSENCK, Hans J.: Dimensions of Personality. Kegan Paul, 1947.

FREEMAN, F. N.: Mental Tests: Their History, Principles and
Applications. Harrap, 1939.

GOODENOUGH, Florence L.: The Measurement of Intelligence by
Drawings. New York, World Book Co., 1926.

GOODENOUGH, Florence L.: Mental Testing. New York, Rinehart,
1949.


GREAT BRITAIN. Board of Education: Report of the Consultative 
Committee on Psychological Tests of Educable Capacity. 
H.M.S.O., 1924. 

GUILFORD, J. P.: Psychometric Methods. New York, McGraw-Hill,
1936.

HEIM, A. W.: An Attempt to Test High-grade Intelligence. British
Journal of Psychology, Vol. 37, Part 2, 1947.

KLOPFER, Bruno, and KELLY, Douglas M.: The Rorschach Technique.
New York, World Book Co., 1946.

KNIGHT, A. R.: Intelligence and Intelligence Testing. Methuen,
1948.

LENNEP, D. J. van: Manual for the Four Picture Test. The Hague,
Nijhoff, 1948.

MACRAE, A.: Talents and Temperaments. Nisbet, 1932.

MUNN, Norman L.: Psychology: The Fundamentals of Human
Adjustment. Boston, Houghton Mifflin, 1946.

MURCHISON, C., editor: A Handbook of Child Psychology. Worcester,
Mass., Clark University Press, 1933.

MURPHY, Gardner: Personality: a Biosocial Approach to Origins
and Structure. New York, Harper, 1947.

MURRAY, Henry A.: The Thematic Apperception Test. Cambridge,
Mass., Harvard University Press, 1943. 

OLDFIELD, Richard C.: The Psychology of the Interview. Methuen, 
1947. 

RORSCHACH, H.: Psychodiagnostics. New York, Grune and
Stratton, 1942.

SCHONELL, F. J.: Backwardness in the Basic Subjects. Edinburgh,
Oliver and Boyd, 1948.

SPEARMAN, Charles E.: The Abilities of Man. Macmillan, 1927.

STUTSMAN, R.: Mental Measurement of Pre-School Children. New
York, World Book Co., 1931.

SZONDI, L.: Test. New York, Grune and Stratton, 1937.

TERMAN, Lewis M., and MERRILL, M. A.: Measuring Intelligence.
Harrap, 1949.

THOMSON, G. H.: The Factorial Analysis of Human Ability.
University of London Press, 1948.

THURSTONE, L. L.: Multiple Factor Analysis. Chicago, University 
of Chicago Press, 1947. 

VALENTINE, C. W.: Intelligence Tests for Children. Methuen, 1948. 

VERNON, P. E.: The Measurement of Abilities. University of Lon- 
don Press, 1940. 

VERNON, P. E., and PARRY, J. B.: Personnel Selection in the British 
Forces. University of London Press, 1949. 

WECHSLER, David: The Measurement of Adult Intelligence. Balti- 
more, Williams and Wilkins, 1944. 

WELFORD, A. T., et al.: Skill and Age. Oxford University Press,

WISEMAN, S.: The Marking of English Composition in Grammar 
School Selection. British Journal of Educational Psychology, 
Vol. 19, Part 3, 1949. 


GROUP TESTS 

Diagnostic Tests in Reading and Arithmetic, by F. J. SCHONELL. 
Edinburgh, Oliver and Boyd. 

Moray House Tests (Intelligence, English, Arithmetic), by G. H. 
THOMSON. University of London Press. 

National Intelligence Tests, by M. E. HAGGERTY and others. Harrap.

Non-Verbal Intelligence Tests, by G. F. SLEIGHT. Harrap.

Northumberland Tests (Intelligence, English, Arithmetic), by C. BURT.
University of London Press.

Otis Tests, by A. S. OTIS. Harrap.

Progressive Matrices, by J. C. RAVEN. Lewis.

Simplex Intelligence Tests, by C. A. RICHARDSON. Harrap.

Southend Tests (Intelligence, Arithmetic), by M. E. HILL. Harrap.


UNIVERSITY OF BRISTOL
INSTITUTE OF EDUCATION

No 1
YOUTH COUNCILS
No 2
TRAINING FOR FULL-TIME
YOUTH LEADERSHIP
No 3
STUDIES IN SELECTION TECHNIQUES
FOR ADMISSION TO
GRAMMAR SCHOOLS

UNIVERSITY OF LONDON PRESS LTD.
WARWICK SQUARE, LONDON, E.C.4

